AITopics | safety index design rule

Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning

Zhao, Weiye, He, Tairan, Li, Feihan, Liu, Changliu

arXiv.org Artificial Intelligence, May-4-2024

Deep reinforcement learning (DRL) has demonstrated impressive performance in many continuous control tasks. However, one major stumbling block to the real-world application of DRL is the lack of safety guarantees. Although DRL agents can satisfy system safety in expectation through reward shaping, it is quite challenging to design a DRL agent that consistently meets hard constraints (e.g., a safety specification) at every time step. On the other hand, existing work in the field of safe control provides guarantees on the persistent satisfaction of hard safety constraints. However, explicit analytical models of the system dynamics are required to synthesize the safe control, and such models are typically not accessible in DRL settings. In this paper, we present a model-free safe control algorithm, the implicit safe set algorithm, for synthesizing safeguards for DRL agents that ensure provable safety throughout training. The proposed algorithm synthesizes a safety index (also called the barrier certificate) and a subsequent safe control law only by querying a black-box dynamics function (e.g., a digital twin simulator). Moreover, we theoretically prove that the implicit safe set algorithm guarantees finite-time convergence to the safe set and forward invariance for both continuous-time and discrete-time systems. We validate the proposed implicit safe set algorithm on the state-of-the-art safety benchmark Safety Gym, where the proposed method achieves zero safety violations and gains 95% ± 9% cumulative reward compared to state-of-the-art safe DRL methods.
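The idea of synthesizing a safe control by only querying a black-box dynamics function can be illustrated with a small sketch. This is not the authors' implementation: the one-dimensional toy dynamics, the linear safety index, the constants `DT`, `D_MIN`, `K`, `ETA`, the grid of candidate controls, and the function names are all assumptions made for illustration. The safeguard keeps the reference control if the simulator says it is acceptable; otherwise it picks the closest candidate whose simulated next state does not raise the safety index above the allowed bound.

```python
from typing import Callable, Optional, Sequence, Tuple

State = Tuple[float, float]  # (distance to obstacle, closing speed)

# Hypothetical constants, chosen only for this toy example.
DT = 0.1     # simulation time step
D_MIN = 1.0  # minimum allowed distance to the obstacle
K = 1.0      # weight on the closing speed in the safety index
ETA = 0.1    # required decrease rate of the index when it is positive


def black_box_step(state: State, u: float) -> State:
    """Toy stand-in for a simulator; u is acceleration toward the obstacle."""
    d, v = state
    v_next = v + u * DT
    return (d - v_next * DT, v_next)


def safety_index(state: State) -> float:
    """Assumed safety index: positive values are treated as unsafe."""
    d, v = state
    return D_MIN - d + K * v


def safeguard(
    state: State,
    u_ref: float,
    step: Callable[[State, float], State],
    candidates: Sequence[float],
) -> Optional[float]:
    """Return a control whose simulated next state satisfies the index bound.

    The dynamics are only queried through `step`, never inspected.
    Returns None if no candidate passes the check.
    """
    # If the index is positive it must decrease; otherwise it must stay <= 0.
    bound = max(safety_index(state) - ETA * DT, 0.0)

    def is_safe(u: float) -> bool:
        return safety_index(step(state, u)) <= bound

    if is_safe(u_ref):
        return u_ref
    safe = [u for u in candidates if is_safe(u)]
    if not safe:
        return None
    # Stay as close as possible to what the learning agent asked for.
    return min(safe, key=lambda u: abs(u - u_ref))


candidates = [i * 0.5 for i in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
print(safeguard((3.0, 0.0), 1.0, black_box_step, candidates))
print(safeguard((1.5, 0.6), 1.0, black_box_step, candidates))
```

In the first call the agent is far from the obstacle and the reference control passes through unchanged; in the second the index is positive, so the reference is replaced by a braking control. A uniform grid is used here only for brevity; it gives no guarantee that a safe control is found when one exists.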

obstacle, safe control, safety index design rule, (11 more...)

2405.02754

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (1.00)

Industry: Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Learn With Imagination: Safe Set Guided State-wise Constrained Policy Optimization

Zhao, Weiye, Sun, Yifan, Li, Feihan, Chen, Rui, Wei, Tianhao, Liu, Changliu

arXiv.org Artificial Intelligence, Aug-24-2023

Deep reinforcement learning (RL) excels in various control tasks, yet the absence of safety guarantees hampers its real-world applicability. In particular, exploration during learning usually results in safety violations, and the RL agent learns from those mistakes. On the other hand, safe control techniques ensure persistent safety satisfaction but demand strong priors on system dynamics, which are usually hard to obtain in practice. To address these problems, we present Safe Set Guided State-wise Constrained Policy Optimization (S-3PO), a pioneering algorithm generating state-wise safe optimal policies with zero training violations, i.e., learning without mistakes. S-3PO first employs a safety-oriented monitor with black-box dynamics to ensure safe exploration. It then enforces a unique cost for the RL agent to converge to optimal behaviors within safety constraints. S-3PO outperforms existing methods in high-dimensional robotics tasks, managing state-wise constraints with zero training violations. This innovation marks a significant stride towards real-world safe RL deployment.
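The two-stage structure described above (a monitor that screens actions through black-box dynamics, plus a cost signal fed back to the learner) can be sketched as follows. The abstract does not define the cost, so the definition used here, the amount by which the policy's proposed action would have pushed a safety index past its bound in simulation, is purely an assumption for illustration. The toy dynamics, the index, the constants, and all function names are hypothetical as well.

```python
from typing import Sequence, Tuple

State = Tuple[float, float]  # (distance to obstacle, closing speed)

# Hypothetical constants for a one-dimensional toy problem.
DT, D_MIN, K = 0.1, 1.0, 1.0


def simulate(state: State, u: float) -> State:
    """Toy black-box dynamics; u is acceleration toward the obstacle."""
    d, v = state
    v_next = v + u * DT
    return (d - v_next * DT, v_next)


def safety_index(state: State) -> float:
    """Assumed safety index: positive values are treated as unsafe."""
    d, v = state
    return D_MIN - d + K * v


def monitored_step(
    state: State, u_policy: float, fallbacks: Sequence[float]
) -> Tuple[State, float, float]:
    """Return (next_state, executed_action, cost) for one monitored step.

    The policy's action is only imagined in simulation first. If it would
    exceed the bound, a fallback is executed instead and the overshoot is
    reported as a cost, so the learner is penalized without the violation
    ever being carried out.
    """
    bound = max(safety_index(state), 0.0)
    imagined = safety_index(simulate(state, u_policy))
    cost = max(imagined - bound, 0.0)
    if cost == 0.0:
        return simulate(state, u_policy), u_policy, 0.0
    # Assumption: the fallback with the lowest simulated index is acceptable.
    # A real monitor would have to verify this rather than assume it.
    u_safe = min(fallbacks, key=lambda u: safety_index(simulate(state, u)))
    return simulate(state, u_safe), u_safe, cost


next_state, executed, cost = monitored_step((1.5, 0.6), 1.0, [-2.0, 0.0, 2.0])
print(executed, cost)
```

A constrained policy optimizer could then be trained to drive this cost to zero, so that the monitor intervenes less often as learning progresses; how S-3PO actually does this is not stated in the abstract.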

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2308.1314

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Transportation (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)